Learning Deep Embeddings with Histogram Loss
We suggest a new loss for learning deep embeddings. The key characteristics of the new loss are the absence of tunable parameters and the very good results obtained across a range of datasets and problems. The loss is computed by estimating two distributions of similarities, for positive (matching) and negative (non-matching) point pairs, and then computing the probability that a positive pair has a lower similarity score than a negative pair based on these probability estimates. We show that these operations can be performed in a simple and piecewise-differentiable manner using 1D histograms with soft assignment operations. This makes the proposed loss suitable for learning deep embeddings using stochastic optimization. The experiments reveal favourable results compared to recently proposed loss functions.
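As a rough illustration of the computation described above, the following NumPy sketch builds soft-assignment 1D histograms of positive and negative similarities and integrates the probability of the reverse ordering. This is not the authors' code; the bin count, the linear (triangular) assignment kernel, and the function names `soft_histogram` and `histogram_loss` are illustrative assumptions.

```python
import numpy as np

def soft_histogram(sims, n_bins=51):
    # Bin centers spanning the similarity range [-1, 1] of unit-norm embeddings.
    nodes = np.linspace(-1.0, 1.0, n_bins)
    delta = nodes[1] - nodes[0]
    # Soft assignment: each similarity contributes linearly to its two
    # nearest bins, which keeps the histogram piecewise-differentiable.
    w = np.maximum(0.0, 1.0 - np.abs(sims[:, None] - nodes[None, :]) / delta)
    h = w.sum(axis=0)
    return h / h.sum()

def histogram_loss(pos_sims, neg_sims, n_bins=51):
    h_pos = soft_histogram(pos_sims, n_bins)
    h_neg = soft_histogram(neg_sims, n_bins)
    cdf_pos = np.cumsum(h_pos)
    # Probability that a random positive similarity falls below
    # a random negative similarity.
    return float(np.sum(h_neg * cdf_pos))
```

A well-separated embedding (positives near similarity 1, negatives near -1) drives this value toward 0, while an inverted one drives it toward 1, which is what makes it usable directly as a loss.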
Reviews: Learning Deep Embeddings with Histogram Loss
The authors propose a new loss function for learning embeddings in deep networks, called the histogram loss. This loss is based on pairwise classification: whether two samples belong to the same class or not. In particular, the authors suggest looking at the distribution of similarities between embeddings on the L2 unit sphere (all embeddings are L2-normalized). The idea is to estimate the distribution of similarities for similar embeddings (positive pairs) and for dissimilar ones (negative pairs), and to minimize the probability that a positive pair has a smaller similarity score than a negative pair. After reviewing previous work in the area (Section 2), the authors develop in Section 3 a method for estimating the histogram loss.
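The pairwise setup the review describes, L2-normalizing a batch of embeddings and splitting all pairwise similarities into positive and negative sets by label agreement, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation; the helper name `pair_similarities` is an assumption.

```python
import numpy as np

def pair_similarities(embeddings, labels):
    # L2-normalize so that dot products are cosine similarities
    # on the unit sphere, bounded in [-1, 1].
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = x @ x.T
    same = labels[:, None] == labels[None, :]
    # Upper triangle (k=1) counts each unordered pair exactly once
    # and excludes self-pairs on the diagonal.
    iu = np.triu_indices(len(labels), k=1)
    return sims[iu][same[iu]], sims[iu][~same[iu]]
```

The two returned arrays are exactly the samples from which the positive and negative similarity distributions would then be estimated.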
Evgeniya Ustinova, Victor Lempitsky